Crossed-Time Delay Neural Network for Speaker Recognition

Abstract

Time Delay Neural Network (TDNN) is a well-performing structure for deep neural network-based speaker recognition systems. In this paper we introduce a novel structure, named Crossed-Time Delay Neural Network (CTDNN), to enhance the performance of the current TDNN for speaker recognition. Inspired by the multi-filter setting of convolution layers in convolutional neural networks, we set multiple time delay units with different context sizes at the bottom layer and construct a multilayer parallel network. The proposed CTDNN gives significant improvements over the original TDNN on both speaker verification and identification tasks. On the VoxCeleb1 dataset, it outperforms the original TDNN in the verification experiment with a 2.6% absolute Equal Error Rate improvement. Under the few-shot condition, CTDNN reaches 90.4% identification accuracy, which doubles the identification accuracy of the original TDNN. We also compare the proposed CTDNN with another new variant of TDNN, Factorized-TDNN, which shows that our model has a 36% improvement under the few-shot condition. Moreover, the proposed CTDNN can handle training with a larger batch more efficiently and hence utilize calculation resources economically.
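
The idea of running several time delay units with different context sizes in parallel at the bottom layer can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names, the ReLU activation, the mean pooling over time, the concatenation of branches, and the context sizes (1, 1), (2, 2), (4, 4) are all assumptions chosen for illustration, and the weights are random rather than trained.

```python
import random


def time_delay_unit(frames, weights, bias, left, right):
    """One time delay unit: a 1-D convolution over time with context [t-left, t+right].

    frames  : list of T feature vectors, each of length D
    weights : H rows, each of length (left + right + 1) * D
    Returns T - left - right output vectors of length H (ReLU is an assumption).
    """
    outputs = []
    for t in range(left, len(frames) - right):
        # Splice the context window into one long input vector.
        window = [x for frame in frames[t - left:t + right + 1] for x in frame]
        outputs.append([
            max(0.0, sum(w * x for w, x in zip(row, window)) + b)
            for row, b in zip(weights, bias)
        ])
    return outputs


def mean_pool(sequence):
    """Average over time so branches with different output lengths can be merged."""
    return [sum(column) / len(sequence) for column in zip(*sequence)]


def crossed_time_bottom_layer(frames, branches):
    """Run units with different context sizes in parallel and concatenate them.

    Pooling followed by concatenation is an assumed way of merging branches.
    """
    merged = []
    for (left, right), (weights, bias) in branches:
        merged.extend(mean_pool(time_delay_unit(frames, weights, bias, left, right)))
    return merged


def random_unit(rng, in_dim, context, hidden):
    """Hypothetical helper: untrained weights for one time delay unit."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim * context)]
               for _ in range(hidden)]
    return weights, [0.0] * hidden


rng = random.Random(0)
T, D, H = 20, 4, 3  # frames, feature dimension, hidden units per branch
frames = [[rng.gauss(0.0, 1.0) for _ in range(D)] for _ in range(T)]
contexts = [(1, 1), (2, 2), (4, 4)]  # assumed context sizes, one per branch
branches = [((l, r), random_unit(rng, D, l + r + 1, H)) for l, r in contexts]
embedding = crossed_time_bottom_layer(frames, branches)
print(len(embedding))  # 3 branches * 3 hidden units = 9
```

Each branch sees the same frames through a different temporal window, so narrow branches respond to short-term detail while wide branches summarize longer spans. A real system would stack further layers on each branch and train the weights.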

Similar resources

Speaker-independent connected letter recognition with a multi-state time delay neural network

We present a Multi-State Time Delay Neural Network (MS-TDNN) for speaker-independent, connected letter recognition. Our MS-TDNN achieves 98.5/92.0% word accuracy on speaker dependent/independent English letter tasks [7, 8]. In this paper we will summarize several techniques to improve (a) continuous recognition performance, such as sentence level training, and (b) phonetic ...

A Time Delay Neural Network for Online Arabic Handwriting Recognition

Handwriting recognition is an interesting area of pattern recognition. In the last decade, several approaches have focused on online handwriting recognition because of the very rapid growth of new data-entry technologies. In this paper, we propose a new system for online Arabic handwriting recognition based on the beta-elliptic model, which allows the trajectory to be segmented into segm...

A time-delay neural network architecture for isolated word recognition

A translation-invariant back-propagation network is described that performs better than a sophisticated continuous acoustic parameter hidden Markov model on a noisy, 100-speaker confusable vocabulary isolated word recognition task. The network's replicated architecture permits it to extract precise information from unaligned training patterns selected by a naive segmentation rule. Keywords--Iso...

Multi-Speaker/speaker-Independent Architectures For The Multi-State Time Delay Neural Network

no way to use (say) the vowels of one ISM and the consonants of another ISM. In a more flexible tuning-in scheme, an individual speaker-mixture can be selected for each phoneme independently, conceptually similar to the speaker-adaptive phoneme models in [7]. With this approach, the performance of the new speakers on the test set improved from 95.4% to 96.5% (excerpted words) with supervised tu...

AN IMPROVED CONTROLLED CHAOTIC NEURAL NETWORK FOR PATTERN RECOGNITION

A sigmoid function is necessary for creating a chaotic neural network (CNN). In this paper, a new function for CNN is proposed that can increase the speed of convergence. In the proposed method, we use a novel signal for controlling chaos. Both the theoretical analysis and computer simulation results show that the performance of CNN can be improved remarkably by using our method. By means of this...

Journal

Journal title: Lecture Notes in Computer Science

Year: 2021

ISSN: 1611-3349, 0302-9743

DOI: https://doi.org/10.1007/978-3-030-67832-6_1